# Implementation of Write Allocate in the K86™ Processors

# Application Note

Publication # 21326 Rev: E Issue Date: November 1998 Amendment/0

This document contains information on a product under development at Advanced Micro Devices (AMD). The information is intended to help you evaluate this product. AMD reserves the right to change or discontinue work on this proposed product without notice.

#### © 1998 Advanced Micro Devices, Inc. All rights reserved.

Advanced Micro Devices, Inc. ("AMD") reserves the right to make changes in its products without notice in order to improve design or performance characteristics.

The information in this publication is believed to be accurate at the time of publication, but AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication or the information contained herein, and reserves the right to make changes at any time, without notice. AMD disclaims responsibility for any consequences resulting from the use of the information included in this publication.

This publication neither states nor implies any representations or warranties of any kind, including but not limited to, any implied warranty of merchantability or fitness for a particular purpose. AMD products are not authorized for use as critical components in life support devices or systems without AMD's written approval. AMD assumes no liability whatsoever for claims associated with the sale or use (including the use of engineering samples) of AMD products, except as provided in AMD's Terms and Conditions of Sale for such products.

#### **Trademarks**

AMD, the AMD logo, K6, and combinations thereof, K86, and AMD-K5 are trademarks, and RISC86 and AMD-K6 are registered trademarks of Advanced Micro Devices, Inc.

MMX is a trademark of Intel Corporation.

Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
# **Contents**

- Revision History
- What is Write Allocate?
- Programming Details
  - Step 1: Determine Processor Model and Stepping
    - Write Allocate Support
    - MSR Format
  - Step 2: AMD-K6<sup>®</sup> Processor Models 6, 7 and AMD-K6-2 Processor Model 8/[7:0]
    - Write Handling Control Register (WHCR)
  - Step 2: AMD-K6-2 Processor Model 8/[F:8]
    - Write Handling Control Register (WHCR)
  - Step 3: AMD-K6 Processor
  - AMD-K6 Processor Programming Example for Write Allocate Registers
    - Case 1: Code Sample
    - Case 2: Code Sample
    - Case 3: Code Sample
  - Step 2: AMD-K5<sup>TM</sup> Processor
  - Step 3: AMD-K5 Processor
  - AMD-K5 Processor Programming Example for Write Allocate Registers
    - Case 1: Code Sample
    - Case 2: Code Sample

# **List of Figures**

- Figure 1. Write Handling Control Register (WHCR) – MSR C000\_0082h (Models 6, 7, and 8/[7:0])
- Figure 2. Write Handling Control Register (WHCR) – MSR C000\_0082h (Model 8/[F:8])
- Figure 3. Write Allocate Top-of-Memory and Control Register (WATMCR) – MSR 85h
- Figure 4. Write Allocate Programmable Memory Range Register (WAPMRR) – MSR 86h
- Figure 5. Hardware Configuration Register (HWCR) – MSR 83h

# **Revision History**

| Date | Rev | Description |
|-----------|-----|------------------------------------------------------------------------------------------------------------------------|
| Sept 1997 | C | Modified description of WCDE bit in Write Handling Control Register (WHCR) Model-Specific Register. See pages 3 and 7. |
| May 1998 | D | Switched order of the AMD-K6 and AMD-K5 information. |
| May 1998 | D | Revised "Step 1: Determine Processor Model and Stepping" on page 2. |
| May 1998 | D | Revised "Step 2: AMD-K6® Processor Models 6, 7 and AMD-K6®-2 Processor Model 8/[7:0]" on page 3. |
| Nov 1998 | E | Added "Step 2: AMD-K6®-2 Processor Model 8/[F:8]" on page 5. |
| Nov 1998 | E | Added case 3 and code sample to "AMD-K6® Processor Programming Example for Write Allocate Registers" on page 7. |

# What is Write Allocate?

Write allocate, if enabled, occurs when the processor has a pending memory write cycle to a cacheable line and the line does not currently reside in the L1 data cache. In this case, the processor performs a burst read cycle to fetch the cache line addressed by the pending write cycle. The data associated with the pending write cycle is merged with the newly allocated cache line and stored in the processor's L1 data cache. The final MESI (Modified, Exclusive, Shared, Invalid) state of the cache line depends on the state of the WB/WT# and PWT signals during the burst read cycle and the subsequent cache write hit.

During the write allocation, a 32-byte burst read cycle is executed in place of a non-burst write cycle. Although the burst read cycle generally takes longer to execute than the write cycle, performance is gained on subsequent write cycles that hit the write-allocated cache line. Because memory accesses in software tend to occur close to each other (the principle of locality), the likelihood of additional write hits to the write-allocated cache line is high.

# **Programming Details**

The steps required for programming write allocate on K86™ processors are as follows:

1. Verify write allocate support by using the CPUID instruction to check for the correct model and stepping of the processor.
2. Configure the Model-Specific Registers (MSRs).
3. Enable write allocate.

**Note:** The BIOS should enable the write allocate mechanisms only after performing any memory sizing or typing algorithms.

# **Step 1: Determine Processor Model and Stepping**

The first step in supporting the write allocate feature of the AMD K86 processors is determining the model and stepping of the processor.

# Write Allocate Support

Write allocate is supported on every stepping for every model of the AMD-K6<sup>®</sup> processor.

Write allocate in the AMD-K5<sup>TM</sup> processor is supported only on Models 1, 2, and 3, and only with a stepping of 4 or greater.

Use the CPUID instruction to determine whether the proper model and stepping of the processor is present. See the *AMD Processor Recognition Application Note*, order# 20734 for more information.

#### **MSR Format**

After determining that the processor supports write allocate, the next step is to configure the corresponding MSR that enables write allocate. This MSR on the AMD-K6 processor is the Write Handling Control Register (WHCR), which has two formats. AMD-K6 processor Models 6 and 7 and AMD-K6-2 processor Model 8 steppings 0 through 7 use the same WHCR format. AMD-K6-2 processor Model 8 steppings 8 through F use a different WHCR format.

For AMD-K6 processors, go to either "Step 2: AMD-K6<sup>®</sup> Processor Models 6, 7 and AMD-K6<sup>®</sup>-2 Processor Model 8/[7:0]" on page 3 or "Step 2: AMD-K6<sup>®</sup>-2 Processor Model 8/[F:8]" on page 5.

For an AMD-K5 processor Model 1, 2, or 3 with a stepping of 4 or greater, go to "Step 2: AMD-K5<sup>TM</sup> Processor" on page 9.

# Step 2: AMD-K6<sup>®</sup> Processor Models 6, 7 and AMD-K6<sup>®</sup>-2 Processor Model 8/[7:0]

The AMD-K6 processor uses two mechanisms (programmable within the WHCR) to determine when to perform write allocates.
A write allocate is performed when either of these mechanisms detects that a pending write is to a cacheable area of memory.

Before programming the WHCR or changing memory cacheability/writeability, the BIOS must write back and invalidate the internal cache by using the WBINVD instruction. In addition, the BIOS should enable the write allocate mechanisms in the WHCR only after performing any memory sizing or typing algorithms.

# Write Handling Control Register (WHCR)

The WHCR contains three fields: the WCDE bit, the Write Allocate Enable Limit (WAELIM) field, and the Write Allocate Enable 15-to-16-Mbyte (WAE15M) bit (see Figure 1).

**Note:** Hardware RESET initializes this MSR to all zeros.

Figure 1. Write Handling Control Register (WHCR) – MSR C000\_0082h (Models 6, 7, and 8/[7:0])

**WCDE.** For proper functionality, always program bit 8 of WHCR to 0.

**Write Allocate Enable Limit.** The WAELIM field is 7 bits wide. This field, multiplied by 4 Mbytes, defines an upper memory limit. Any pending write cycle that addresses memory below this limit causes the processor to perform a write allocate. Write allocate is disabled for memory accesses at and above this limit unless the processor determines a pending write cycle is cacheable by means of one of the other write allocate mechanisms: Write to a Cacheable Page and Write to a Sector (for more information, see the Cache chapter in the *AMD-K6*® *Processor Data Sheet*, order# 20695 or the *AMD-K6*®-2 *Processor Data Sheet*, order# 21850).

The maximum value of this limit is $(2^7 - 1) \cdot 4$ Mbytes = 508 Mbytes. When all the bits in this field are set to 0, all memory is above this limit and the write allocate mechanism is disabled.

Once the BIOS determines the amount of RAM installed in the system, this number should also be used to program the WAELIM field. For example, a system with 32 Mbytes of RAM would program the WAELIM field with the value 0001000b.
This value (8), when multiplied by 4 Mbytes, yields 32 Mbytes as the write allocate limit.

**Write Allocate Enable 15-to-16-Mbyte.** The WAE15M bit is used to enable write allocations for the memory write cycles that address the 1 Mbyte of memory between 15 Mbytes and 16 Mbytes. This bit must be set to 1 to allow write allocates in this memory area. This sub-mechanism of the WAELIM provides a memory hole to prevent write allocates. This memory hole is provided to account for a small number of uncommon memory-mapped I/O adapters that use this particular memory address space. If the system contains one of these peripherals, the bit should be set to 0. The WAE15M bit is ignored if the value in the WAELIM field is set to less than 16 Mbytes.

By definition, write allocations in the AMD-K6 are never performed in the memory area between 640 Kbytes and 1 Mbyte unless the processor determines a pending write cycle is cacheable by means of Write to a Cacheable Page or Write to a Sector. It is not safe to perform write allocations between 640 Kbytes and 1 Mbyte (000A\_0000h to 000F\_FFFFh) because it is considered a noncacheable region of memory.

To complete programming write allocate on the AMD-K6 processor, go to "Step 3: AMD-K6® Processor" on page 6.

# Step 2: AMD-K6<sup>®</sup>-2 Processor Model 8/[F:8]

The AMD-K6-2 processor uses two mechanisms (programmable within the WHCR) to determine when to perform write allocates. A write allocate is performed when either of these mechanisms detects that a pending write is to a cacheable area of memory.

Before programming the WHCR or changing memory cacheability/writeability, the BIOS must write back and invalidate the internal cache by using the WBINVD instruction. In addition, the BIOS should enable the write allocate mechanisms in the WHCR only after performing any memory sizing or typing algorithms.
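Which register layout applies depends on the model and stepping found in Step 1. The following is a minimal sketch of that decision, not code from this application note. It assumes the caller has already confirmed the "AuthenticAMD" vendor string (other vendors also report family 5), that the CPUID function 1 signature in EAX uses the usual layout (family in bits 11-8, model in bits 7-4, stepping in bits 3-0), and that the K5 and K6 processors report family 5. The enum and function names are hypothetical.

```c
#include <stdint.h>

/* Register layouts described in this application note (names are illustrative). */
enum wa_format {
    WA_NONE,            /* write allocate not supported per this note */
    WA_K5,              /* AMD-K5: HWCR, WATMCR, and WAPMRR */
    WA_K6_WHCR_7BIT,    /* AMD-K6 Models 6, 7 and AMD-K6-2 Model 8/[7:0] */
    WA_K6_WHCR_10BIT    /* AMD-K6-2 Model 8/[F:8] */
};

/* signature: EAX from CPUID function 1, vendor already verified as AMD.
   Assumed layout: family bits 11-8, model bits 7-4, stepping bits 3-0. */
enum wa_format wa_format_from_signature(uint32_t signature)
{
    uint32_t family   = (signature >> 8) & 0xF;
    uint32_t model    = (signature >> 4) & 0xF;
    uint32_t stepping = signature & 0xF;

    if (family != 5)                    /* assumption: K5 and K6 are family 5 */
        return WA_NONE;

    switch (model) {
    case 1: case 2: case 3:             /* AMD-K5: stepping 4 or greater only */
        return stepping >= 4 ? WA_K5 : WA_NONE;
    case 6: case 7:                     /* AMD-K6: every stepping */
        return WA_K6_WHCR_7BIT;
    case 8:                             /* AMD-K6-2: format depends on stepping */
        return stepping <= 7 ? WA_K6_WHCR_7BIT : WA_K6_WHCR_10BIT;
    default:                            /* models not covered by this note */
        return WA_NONE;
    }
}
```

For example, a signature of 588h (Model 8, stepping 8) selects the Model 8/[F:8] format described in this step, while 580h selects the earlier format.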
# Write Handling Control Register (WHCR)

The WHCR contains two fields: the Write Allocate Enable Limit (WAELIM) field and the Write Allocate Enable 15-to-16-Mbyte (WAE15M) bit (see Figure 2).

**Note:** The WHCR register as defined in the Model 6, Model 7, and Model 8/[7:0] has changed in the Model 8/[F:8].

**Note:** Hardware RESET initializes this MSR to all zeros.

Figure 2. Write Handling Control Register (WHCR) – MSR C000\_0082h (Model 8/[F:8])

**Write Allocate Enable Limit.** The WAELIM field is 10 bits wide. This field, multiplied by 4 Mbytes, defines an upper memory limit. Any pending write cycle that addresses memory below this limit causes the processor to perform a write allocate. Write allocate is disabled for memory accesses at and above this limit unless the processor determines a pending write cycle is cacheable by means of one of the other write allocate mechanisms: Write to a Cacheable Page and Write to a Sector (for more information, see the Cache chapter in the *AMD-K6*®-2 *Processor Data Sheet*, order# 21850).

The maximum value of this limit is $(2^{10} - 1) \cdot 4$ Mbytes = 4092 Mbytes. When all the bits in this field are set to 0, all memory is above this limit and the write allocate mechanism is disabled.

Once the BIOS determines the amount of RAM installed in the system, this number should also be used to program the WAELIM field. For example, a system with 32 Mbytes of RAM would program the WAELIM field with the value 00\_0000\_1000b. This value (8), when multiplied by 4 Mbytes, yields 32 Mbytes as the write allocate limit.

**Write Allocate Enable 15-to-16-Mbyte.** The WAE15M bit is used to enable write allocations for the memory write cycles that address the 1 Mbyte of memory between 15 Mbytes and 16 Mbytes. This bit must be set to 1 to allow write allocates in this memory area. This sub-mechanism of the WAELIM provides a memory hole to prevent write allocates.
This memory hole is provided to account for a small number of uncommon memory-mapped I/O adapters that use this particular memory address space. If the system contains one of these peripherals, the bit should be set to 0. The WAE15M bit is ignored if the value in the WAELIM field is set to less than 16 Mbytes.

By definition, write allocations are never performed in the memory area between 640 Kbytes and 1 Mbyte unless the processor determines a pending write cycle is cacheable by means of Write to a Cacheable Page or Write to a Sector. It is not safe to perform write allocations between 640 Kbytes and 1 Mbyte (000A\_0000h to 000F\_FFFFh) because it is considered a noncacheable region of memory.

Additionally, if a memory region is defined as write-combinable or uncacheable by a Memory Type Range Register (MTRR), write allocates are not performed in that region.

# Step 3: AMD-K6<sup>®</sup> Processor

The BIOS programmer has several options regarding what the end-user can control. The BIOS can provide the end-user with a setup screen option to enable or disable write allocate, or with options to define the Write Allocate Enable Limit field and set the Write Allocate Enable 15-to-16-Mbyte bit. The BIOS can also automatically enable and set up the write allocate feature and its registers without end-user intervention. This automatic setup is recommended.

To disable all write allocate features for the AMD-K6 processor, the WHCR must be set to 0000\_0000\_0000\_0000h, which is the default value after power-on reset.
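As an illustration of how a BIOS might derive the WHCR value from the installed RAM size, here is a minimal sketch, not AMD's implementation. The function names are hypothetical. The bit positions are inferred from the example values in this note: WAE15M in bit 0 and WAELIM in bits 7-1 for Models 6, 7, and 8/[7:0], and WAE15M in bit 16 and WAELIM in bits 31-22 for Model 8/[F:8]. Rounding down to a 4-Mbyte multiple and clamping to the field maximum are assumptions of this sketch.

```c
#include <stdint.h>

/* Low 32 bits of WHCR for Models 6, 7, and 8/[7:0].
   Assumed layout: WAE15M = bit 0, WAELIM = bits 7-1, WCDE = bit 8 (left at 0).
   hole_15m is nonzero when an adapter occupies the 15-to-16-Mbyte range. */
uint32_t whcr_low_7bit(uint32_t ram_mbytes, int hole_15m)
{
    uint32_t waelim = ram_mbytes / 4;       /* limit in 4-Mbyte units */
    if (waelim > 0x7Fu)                     /* 7-bit field: 508 Mbytes maximum */
        waelim = 0x7Fu;
    return (waelim << 1) | (hole_15m ? 0u : 1u);
}

/* Low 32 bits of WHCR for Model 8/[F:8].
   Assumed layout: WAE15M = bit 16, WAELIM = bits 31-22. */
uint32_t whcr_low_10bit(uint32_t ram_mbytes, int hole_15m)
{
    uint32_t waelim = ram_mbytes / 4;       /* limit in 4-Mbyte units */
    if (waelim > 0x3FFu)                    /* 10-bit field: 4092 Mbytes maximum */
        waelim = 0x3FFu;
    return (waelim << 22) | (hole_15m ? 0u : (1u << 16));
}
```

The result would be loaded into EAX, with EDX cleared, before executing WRMSR with ECX = C000\_0082h. A system with 32 Mbytes and a 15-Mbyte hole gives 10h in the earlier format, and a 64-Mbyte system without a hole gives 0401\_0000h in the Model 8/[F:8] format.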
# **AMD-K6<sup>®</sup> Processor Programming Example for Write Allocate Registers**

The following cases show examples of programming the write allocate feature for three types of systems:

#### Case 1

For AMD-K6 processor Models 6 and 7 and AMD-K6-2 processor Model 8/[7:0] systems that have a 1-Mbyte memory hole starting at the 15-Mbyte boundary, and 32 Mbytes of total memory:

- Program the WHCR MSR (ECX=C000\_0082h) with WCDE=0, WAELIM=8, and WAE15M=0
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_0000\_0010h

## **Code Sample**

```
;flush cache
PUSHF                   ;save state
CLI                     ;disable interrupts
WBINVD                  ;write back and invalidate cache
;set Write Allocate Limit and clear WAE15M bit
MOV     ECX,0C0000082H  ;select WHCR
MOV     EAX,10H         ;WCDE=0,WAELIM=8,WAE15M=0
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
POPF                    ;restore original state
```

#### Case 2

For AMD-K6 processor Models 6 and 7 and AMD-K6-2 processor Model 8/[7:0] systems that do not have a memory hole starting at the 15-Mbyte boundary, and have 16 Mbytes of total memory:

- Program the WHCR MSR (ECX=C000\_0082h) with WCDE=0, WAELIM=4, and WAE15M=1
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_0000\_0009h

## **Code Sample**

```
;flush cache
PUSHF                   ;save state
CLI                     ;disable interrupts
WBINVD                  ;write back and invalidate cache
;set Write Allocate Limit and set WAE15M bit
MOV     ECX,0C0000082H  ;select WHCR
MOV     EAX,09H         ;WCDE=0,WAELIM=4,WAE15M=1
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
POPF                    ;restore original state
```

#### Case 3

For AMD-K6-2 processor Model 8/[F:8] systems that do not have a memory hole starting at the 15-Mbyte boundary, and have 64 Mbytes of total memory:

- Program the WHCR MSR (ECX=C000\_0082h) with WAELIM=16d and WAE15M=1
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_0401\_0000h

## **Code Sample**

```
;flush cache
PUSHF                   ;save state
CLI                     ;disable interrupts
WBINVD                  ;write back and invalidate cache
;set Write Allocate Limit and set WAE15M bit
MOV     ECX,0C0000082H  ;select WHCR
MOV     EAX,04010000H   ;WAELIM=16d,WAE15M=1
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
POPF                    ;restore original state
```

# Step 2: AMD-K5™ Processor

The AMD-K5 processor implements write allocate by providing a global write allocate enable bit, three range-protection enable bits, and two memory range registers. The global write allocate enable bit is accessed using the Hardware Configuration Register (HWCR). The memory range registers and range enable bits are programmed by read/write MSR instructions.

While the range registers are being programmed, the Write Allocate Enable bit (bit 4 of HWCR) should be set to 0, which prevents potential erroneous behavior in the case of a warm boot during write allocate initialization.

Two MSRs are defined to support write allocate. The MSRs are accessed using the RDMSR and WRMSR instructions (see "RDMSR and WRMSR" in the *AMD-K5*<sup>TM</sup> *Processor Software Development Guide*, order# 20007). The following index values in the ECX register access the MSRs:

- Write Allocate Top-of-Memory and Control Register (WATMCR): ECX = 85h
- Write Allocate Programmable Memory Range Register (WAPMRR): ECX = 86h

Three non-write-allocatable memory ranges are defined for use with the write allocate feature: one fixed range and two programmable ranges.

**Fixed Range.** The fixed memory range is 000A\_0000h-000F\_FFFFh, and its protection can be enabled or disabled. When protection is enabled, write allocate cannot be performed in this range. This region of memory, which includes standard VGA and other peripheral and BIOS access, is considered noncacheable. Performing a write allocate in this area can cause compatibility problems. It is recommended that this bit be enabled (set to 1) to prevent write allocate to this range. Set bit 16 of WATMCR to enable protection of this range.

**Programmable Range.** One programmable memory range is xxxx\_0000h-yyyy\_FFFFh, where xxxx and yyyy are defined using bits 15-0 and bits 31-16 of WAPMRR, respectively. Set bit 17 of WATMCR to enable protection of this range. When enabled, write allocate cannot be performed in this range.
This programmable memory range exists because a small number of uncommon memory-mapped I/O adapters are mapped to physical RAM locations. If a card like this exists in the system configuration, it is recommended that the BIOS program the 'memory hole' for the adapter into this non-write-allocatable range.

**Top of Memory.** The other programmable memory range is defined by the 'top-of-memory' field. The top of memory is equal to zzzz\_0000h, where zzzz is defined using bits 15-0 of WATMCR. Addresses above zzzz\_0000h are protected from write allocate when bit 18 of WATMCR is enabled.

Once the BIOS determines the size of RAM installed in the system, this size should also be used to program the top of memory. For example, a system with 32 Mbytes of RAM requires that the top-of-memory field be programmed with a value of 0200h, which enables protection from write allocate for memory above that value. Set bit 18 of WATMCR to enable protection of this range.

Caching and write allocate are generally not performed for the memory above the amount of physical RAM in the system. Video frame buffers are usually mapped above physical RAM. If write allocate were attempted in that memory area, there could be performance degradation or compatibility problems.

Bits 18–16 of WATMCR control the enabling or disabling of the three memory ranges as follows:

- Bit 18: Top-of-Memory Enable bit
  - 0 = disabled (default)
  - 1 = enabled (write allocate cannot be performed above Top of Memory)
- Bit 17: Programmable Range Enable bit
  - 0 = disabled (default)
  - 1 = enabled (write allocate cannot be performed in this range)
- Bit 16: Fixed Range Enable bit
  - 0 = disabled (default)
  - 1 = enabled (write allocate cannot be performed in this range)

Figures 3 and 4 show the bit positions for these two new registers.

Figure 3. Write Allocate Top-of-Memory and Control Register (WATMCR) – MSR 85h

Figure 4. Write Allocate Programmable Memory Range Register (WAPMRR) – MSR 86h

# Step 3: AMD-K5™ Processor

All of the write allocate features in the AMD-K5 processor are enabled by setting bit 4 (WA) of the HWCR (MSR 83h) to 1. For more information on the HWCR, see "Hardware Configuration Register" in the *AMD-K5<sup>TM</sup> Processor Software Development Guide*, order# 20007. Figure 5 shows the definition of HWCR.

The BIOS programmer has several options regarding what the end-user can control. The BIOS can provide the end-user with a setup screen option to enable write allocate. The BIOS can also provide a setup screen option to set up the other features (programmable ranges and fixed range). Alternatively, the BIOS can automatically enable and set up the write allocate feature and its registers without end-user intervention. This automatic setup is recommended.

Figure 5. Hardware Configuration Register (HWCR) – MSR 83h

# **AMD-K5™ Processor Programming Example for Write Allocate Registers**

The following cases show examples of programming the write allocate feature for two types of systems:

#### Case 1

For systems that have 16 Mbytes of total memory and no memory hole:

- Program the WATMCR MSR (ECX=85h) with top of memory (0100h) and enable bits (0005h) to protect the fixed range and above the top of memory
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_0005\_0100h

**Note:** For 8-Mbyte systems, program 0080h in the lowest 16 bits. For 32-Mbyte systems, program 0200h in the lowest 16 bits.
## **Code Sample**

```
;disable WA bit (bit 4 of HWCR)
MOV     ECX,83H         ;select HWCR (83h)
RDMSR                   ;read HWCR
AND     EAX,NOT 10H     ;clear bit 4
WRMSR
;program top-of-memory and control bits
MOV     ECX,85H         ;select WATMCR
MOV     EAX,50100H      ;TME=1,PRE=0,FRE=1,TOM=0100h
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
;enable WA bit (see Step 3)
MOV     ECX,83H         ;select HWCR (83h)
RDMSR                   ;read HWCR
OR      EAX,10H         ;set bit 4
WRMSR
```

#### Case 2

For systems with a 1-Mbyte memory hole starting at the 15-Mbyte boundary and 32 Mbytes of total memory:

- Program the WAPMRR MSR (ECX=86h) with 15 Mbytes (00F0h) to 16 Mbytes – 1 (00FFh)
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_00FF\_00F0h
- Program the WATMCR MSR (ECX=85h) with top of memory (0200h) and all enable bits (0007h) to protect above the top of memory, the fixed range, and the programmable range
- Use the WRMSR instruction and the 64-bit hex value 0000\_0000\_0007\_0200h

**Note:** For 8-Mbyte systems, program 0080h in the lowest 16 bits. For 16-Mbyte systems, program 0100h in the lowest 16 bits.

## **Code Sample**

```
;disable WA bit (bit 4 of HWCR)
MOV     ECX,83H         ;select HWCR (83h)
RDMSR                   ;read HWCR
AND     EAX,NOT 10H     ;clear bit 4
WRMSR
;program programmable range to 15-16 Mbytes
MOV     ECX,86H         ;select WAPMRR
MOV     EAX,0FF00F0H    ;addresses F00000h to FFFFFFh
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
;program top of memory and control bits
MOV     ECX,85H         ;select WATMCR
MOV     EAX,70200H      ;TME=1,PRE=1,FRE=1,TOM=0200h
XOR     EDX,EDX         ;clear upper 32 bits
WRMSR
;enable WA bit
MOV     ECX,83H         ;select HWCR (83h)
RDMSR                   ;read HWCR
OR      EAX,10H         ;set bit 4
WRMSR
```
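The WATMCR and WAPMRR values used in the two AMD-K5 cases can be derived from the RAM size and the hole addresses. The following is a minimal sketch under stated assumptions, not AMD's implementation: the function and macro names are hypothetical, the fixed range and the top of memory are always protected (as recommended in Step 2), and sizes that overflow the 16-bit top-of-memory field are clamped.

```c
#include <stdint.h>

/* WATMCR enable bits as described in Step 2 (macro names are illustrative). */
#define WATMCR_FRE (1u << 16)   /* Fixed Range Enable */
#define WATMCR_PRE (1u << 17)   /* Programmable Range Enable */
#define WATMCR_TME (1u << 18)   /* Top-of-Memory Enable */

/* Low 32 bits of WATMCR. Bits 15-0 hold the top of memory in 64-Kbyte
   units (address zzzz_0000h), so 1 Mbyte corresponds to 16 units. */
uint32_t watmcr_low(uint32_t ram_mbytes, int protect_programmable_range)
{
    uint32_t tom = ram_mbytes * 16u;            /* Mbytes -> 64-Kbyte units */
    if (ram_mbytes > 0xFFFu)                    /* assumption: clamp to 16 bits */
        tom = 0xFFFFu;
    uint32_t value = tom | WATMCR_FRE | WATMCR_TME;
    if (protect_programmable_range)
        value |= WATMCR_PRE;
    return value;
}

/* Low 32 bits of WAPMRR for the range first_addr..last_addr (inclusive).
   Bits 15-0 hold xxxx (start, xxxx_0000h); bits 31-16 hold yyyy (end, yyyy_FFFFh). */
uint32_t wapmrr_low(uint32_t first_addr, uint32_t last_addr)
{
    return (last_addr & 0xFFFF0000u) | (first_addr >> 16);
}
```

With these helpers, `watmcr_low(16, 0)` gives 0005\_0100h (Case 1), while `wapmrr_low(0x00F00000, 0x00FFFFFF)` and `watmcr_low(32, 1)` give 00FF\_00F0h and 0007\_0200h (Case 2). Each value goes in EAX, with EDX cleared, for the corresponding WRMSR.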